In a video conference system, when a sender's endpoint captures audio and video streams and sends them to a receiver's endpoint, the receiver's endpoint relies on the timestamps of the two streams to synchronize the audio stream with the video stream.
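
As an illustration only, the following minimal sketch shows one way a receiver might use timestamps to align the two streams. It assumes, hypothetically, that each frame carries a capture timestamp in milliseconds taken from a single sender clock shared by both streams, and that the receiver records a local arrival time for each frame. The function name `playout_times` and the tuple layout are illustrative and are not taken from any particular system.

```python
def playout_times(audio, video):
    """Compute receiver playout times that keep audio and video aligned.

    audio, video: non-empty lists of (capture_ts_ms, arrival_ms) tuples.
    Assumption: capture_ts_ms comes from one sender clock shared by both
    streams; arrival_ms comes from the receiver's local clock.
    """
    # Use one common delay for both streams: the largest observed gap
    # between arrival and capture. Every frame has arrived by its playout time.
    delay = max(arrival - capture for capture, arrival in audio + video)

    # Frames captured at the same instant get the same playout time,
    # regardless of which stream they belong to.
    audio_playout = [capture + delay for capture, _ in audio]
    video_playout = [capture + delay for capture, _ in video]
    return audio_playout, video_playout


# Example: video frames arrive later than audio frames captured at the same time.
audio_frames = [(0, 40), (20, 65)]
video_frames = [(0, 120), (33, 150)]
audio_out, video_out = playout_times(audio_frames, video_frames)
```

In this sketch the audio frame and the video frame captured at timestamp 0 are both scheduled for the same playout instant, so the faster stream is effectively held back to wait for the slower one.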